
    A latent variable ranking model for content-based retrieval

    34th European Conference on IR Research, ECIR 2012, Barcelona, Spain, April 1-5, 2012. Proceedings.
    Since their introduction, ranking SVM models [11] have become a powerful tool for training content-based retrieval systems. All we need for training a model are retrieval examples in the form of triplet constraints, i.e. examples specifying that, relative to some query, a database item a should be ranked higher than a database item b. These types of constraints could be obtained from the feedback of users of the retrieval system. Most previous ranking models learn either a global combination of elementary similarity functions or a combination defined with respect to a single database item. Instead, we propose a “coarse to fine” ranking model where, given a query, we first compute a distribution over “coarse” classes and then use the linear combination that has been optimized for queries of that class. These coarse classes are hidden and need to be induced by the training algorithm. We propose a latent variable ranking model that induces both the latent classes and the weights of the linear combination for each class from ranking triplets. Our experiments over two large image datasets and a text retrieval dataset show the advantages of our model over learning a global combination as well as a combination for each test point (i.e. the transductive setting). Furthermore, compared to the transductive approach, our model has a clear computational advantage since it does not need to be retrained for each test query.
    Spanish Ministry of Science and Innovation (JCI-2009-04240); EU PASCAL2 Network of Excellence (FP7-ICT-216886)
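
The "coarse to fine" scoring described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the softmax over per-class linear scores, the probability-weighted mixing of the class-specific combinations, the hinge form of the triplet loss, and all names (`class_params`, `class_weights`, `margin`) are assumptions made here for illustration only.

```python
import math


def class_distribution(query_features, class_params):
    """Distribution over hidden coarse classes (assumed here: softmax of linear scores)."""
    logits = [sum(p * x for p, x in zip(params, query_features))
              for params in class_params]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ranking_score(query_features, similarities, class_params, class_weights):
    """Mix the per-class linear combinations of elementary similarities.

    `similarities` holds the elementary similarity values between the query
    and one database item; `class_weights[k]` is the (hypothetical) weight
    vector associated with coarse class k.
    """
    probs = class_distribution(query_features, class_params)
    return sum(
        p * sum(w * s for w, s in zip(weights, similarities))
        for p, weights in zip(probs, class_weights)
    )


def triplet_hinge_loss(score_a, score_b, margin=1.0):
    """Penalty for a triplet (query, a, b) in which a should outrank b."""
    return max(0.0, margin - (score_a - score_b))
```

A triplet constraint is satisfied with zero loss once item a outscores item b by at least the margin; because the class distribution is computed from the query alone, scoring a new query requires no retraining.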

    Semi-supervised prediction of protein interaction sentences exploiting semantically encoded metrics

    Protein-protein interaction (PPI) identification is an integral component of many biomedical research and database curation tools. Automation of this task through classification is one of the key goals of text mining (TM). However, the labelled PPI corpora required to train classifiers are generally small. To overcome this sparsity in the training data, we propose a novel method of integrating corpora that do not contain relevance judgements. Our approach uses a semantic language model to gather word similarity from a large unlabelled corpus. This additional information is integrated into the sentence classification process using kernel transformations and has a re-weighting effect on the training features that leads to an 8% improvement in F-score over the baseline results. Furthermore, we discover that some words which are generally considered indicative of interactions are actually neutralised by this process.
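
One common way to fold word similarity into a kernel, shown here only as a hedged illustration and not as the method of this paper, is the bilinear form k(x, y) = xᵀ S y, where x and y are bag-of-words vectors and S is a word-similarity matrix. The vocabulary, the similarity values, and the function name below are hypothetical; S must be symmetric positive semi-definite for the result to be a valid kernel.

```python
def semantic_kernel(x, y, similarity):
    """Compute x^T S y for bag-of-words vectors x, y and similarity matrix S."""
    return sum(
        x[i] * similarity[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


# Hypothetical three-word vocabulary: ["binds", "interacts", "cell"].
# The off-diagonal 0.8 is an assumed similarity between "binds" and "interacts".
S = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
IDENTITY = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

sentence_a = [1, 0, 0]  # contains only "binds"
sentence_b = [0, 1, 0]  # contains only "interacts"

plain = semantic_kernel(sentence_a, sentence_b, IDENTITY)  # ordinary linear kernel
enriched = semantic_kernel(sentence_a, sentence_b, S)      # similarity-aware kernel
```

With the identity matrix the two sentences share no words and the kernel value is zero; with S, the assumed similarity between the two verbs makes the sentences comparable, which is the kind of feature re-weighting the abstract refers to.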

    Machine Learning in Automated Text Categorization

    The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting of the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation.
    Comment: Accepted for publication in ACM Computing Surveys

    Determination of step-edge barriers to interlayer transport from surface morphology during the initial stages of homoepitaxial growth

    We use analytic formulae obtained from a simple model of crystal growth by molecular-beam epitaxy to determine step-edge barriers to interlayer transport. The method is based on information about the surface morphology at the onset of nucleation on top of first-layer islands in the submonolayer coverage regime of homoepitaxial growth. The formulae are tested using kinetic Monte Carlo simulations of a solid-on-solid model and applied to estimate step-edge barriers from scanning-tunneling microscopy data on initial stages of Fe(001), Pt(111), and Ag(111) homoepitaxy.
    Comment: 4 pages, a Postscript file, uuencoded and compressed. Physical Review B, Rapid Communications, in press

    Pseudo Goldstone Bosons Phenomenology in Minimal Walking Technicolor

    We construct the non-linearly realized Lagrangian for the Goldstone Bosons associated to the breaking pattern of SU(4) to SO(4). This pattern is expected to occur in any Technicolor extension of the standard model featuring two Dirac fermions transforming according to real representations of the underlying gauge group. We concentrate on the Minimal Walking Technicolor quantum number assignments with respect to the standard model symmetries. We demonstrate that, for any choice of the quantum numbers consistent with gauge and Witten anomalies, the spectrum of the pseudo Goldstone Bosons contains electrically doubly charged states which can be discovered at the Large Hadron Collider.
    Comment: 25 pages, 5 figures

    Racial discrimination and depressive symptoms among African-American men: The mediating and moderating roles of masculine self-reliance and John Henryism

    Despite well-documented associations between everyday racial discrimination and depression, mechanisms underlying this association among African-American men are poorly understood. Guided by the Transactional Model of Stress and Coping, we frame masculine self-reliance and John Henryism as appraisal mechanisms that influence the relationship between racial discrimination, a source of significant psychosocial stress, and depressive symptoms among African-American men. We also investigate whether the proposed relationships vary by reported discrimination-specific coping responses. Participants were 478 African-American men recruited primarily from barbershops in the West and South regions of the United States. Multiple linear regression and Sobel-Goodman mediation analyses were used to examine direct and mediated associations between our study variables. Racial discrimination and masculine self-reliance were positively associated with depressive symptoms, though the latter only among active responders. John Henryism was negatively associated with depressive symptoms, mediated the masculine self-reliance-depressive symptoms relationship, and among active responders moderated the racial discrimination-depressive symptoms relationship. Though structural interventions are essential, clinical interventions designed to mitigate the mental health consequences of racial discrimination among African-American men should leverage masculine self-reliance and active coping mechanisms.

    Classification of protein interaction sentences via gaussian processes

    The increase in the availability of protein interaction studies in textual format, coupled with the demand for easier access to the key results, has led to a need for text mining solutions. In the text processing pipeline, classification is a key step for extraction of small sections of relevant text. Consequently, for the task of locating protein-protein interaction sentences, we examine the use of a classifier which has rarely been applied to text, Gaussian processes (GPs). GPs are a non-parametric probabilistic analogue to the more popular support vector machines (SVMs). We find that GPs outperform the SVM and naïve Bayes classifiers on binary sentence data, whilst showing equivalent performance on abstract and multiclass sentence corpora. In addition, the lack of the margin parameter, which requires costly tuning, along with the principled multiclass extensions enabled by the probabilistic framework, make GPs an appealing alternative worthy of further adoption.

    The Dynamics of a Rigid Body in Potential Flow with Circulation

    We consider the motion of a two-dimensional body of arbitrary shape in a planar irrotational, incompressible fluid with a given amount of circulation around the body. We derive the equations of motion for this system by performing symplectic reduction with respect to the group of volume-preserving diffeomorphisms and obtain the relevant Poisson structures after a further Poisson reduction with respect to the group of translations and rotations. In this way, we recover the equations of motion given for this system by Chaplygin and Lamb, and we give a geometric interpretation for the Kutta-Zhukowski force as a curvature-related effect. In addition, we show that the motion of a rigid body with circulation can be understood as a geodesic flow on a central extension of the special Euclidean group SE(2), and we relate the cocycle in the description of this central extension to a certain curvature tensor.
    Comment: 28 pages, 2 figures; v2: typos corrected